1 research outputs found

    Synchronization of streamed audio between multiple playback devices over an unmanaged IP network

    Get PDF
    When designing and implementing a prototype supporting inter-destination media synchronization – synchronized playback between multiple devices receiving the same stream – there are a lot of aspects that need to be considered, especially when working with unmanaged networks. Not only is a proper streaming protocol essential, but also a way to obtain and maintain the synchronization of the clocks of the devices. The thesis had a few constraints, namely that the server producing the stream should be written for the .NET-platform and that the clients receiving it should be using the media framework GStreamer. This framework provides methods for both achieving synchronization as well as resynchronization. As the provided resynchro- nization methods introduced distortions in the audio, an alternative method was implemented. This method focused on minimizing the distortions, thus maintain- ing a smooth playback. After the prototype had been implemented, it was tested to see how well it performed under the influence of packet loss and delay. The accuracy of the synchronization was also tested under optimal conditions using two different time synchronization protocols. What could be concluded from this was that a good synchronization could be maintained on unloaded networks using the proposed method, but when introducing delay the prototype struggled more. This was mainly due to the usage of the Network Time Protocol (NTP), which is known to perform badly on networks with asymmetric paths.When working with synchronized playback it is not enough just obtain- ing it – it also needs to be maintained. Implementing a prototype thus involves many parts ranging from choosing a proper streaming protocol, to handling glitch free resynchronization of audio. Synchronization between multiple speakers has a wide area of application, ranging from home entertainment solutions to big malls where announcements should appear synchronized over the entire perimeter. In order to achieve this, two main parts are involved: the streaming of the audio, and the actual synchronization. The streaming itself poses problems mostly since the prototype should not only work on dedicated networks, but rather on all kinds, such as the Internet. As the information over these networks are transmitted in packets, and the path from source to destination crosses many sub networks, the packets may be delayed or even lost. This may create an audible distortion in the playback. The next part is the synchronization. This is most easily achieved by putting a time on each packet stating when in the future it should be played out. If then all receivers play it back at the specified time, synchronization is achieved. This however requires that all the receivers share the idea of when a specific time is – the clocks at all the receivers must be synchronized. By using existing software and hardware solutions, such as the Network Time Protocol (NTP) or the Precision Time Protocol (PTP), this can be accomplished. The accuracy of the synchronization is therefore partly dependent on how well these solutions work. Another valid aspect is how accurate the synchronization must be for the sound to be perceived as synchronized by humans. This is usually in the range of a few tens of milliseconds to five milliseconds depending on the sound. When a global time has been distributed to all receivers, matters get more complicated as there is more than one clock to consider at each receiver. Apart from the previously mentioned clock, now called the ’system clock’, there is also an audio clock, which is a hardware clock positioned on the sound card. This audio clock decides the rate at which media is played out. Altering the system clock to synchronize it to a common time is one thing, but altering the audio clock while media is being played will inevitably mean a jump in the playback, and thus a distortion. Although an initial synchronization can be achieved, the two clocks will over time tick in slightly different pace, thus drifting away from each other. This creates a need for the audio clock to continuously correct itself to follow the system clock. In the media framework GStreamer, used for handling the media at the re- ceivers, two alternatives to solve the correction problem were available. Quick evaluations of these two methods however showed that either audible glitches or ’oscillations’ occurred in the sound, when the clocks were corrected. A new method, which basically combines the two existing, was therefore implemented. With this method the audio clock is continuously corrected, but in a smaller and less aggressive way. Listening tests revealed much smaller, often not audible, distortions, while the synchronization performance was at par with the existing methods. More thorough testing showed that the synchronization over networks with light traffic was in the microsecond-range, thus far below the threshold of what will appear as synchronized. During worse conditions – simulated hostile environments – the synchronization quickly reached unacceptable levels though. This was due to the previously mentioned NTP, and not the implemented method on the other hand
    corecore